404 research outputs found

    An Ontology-Based Artificial Intelligence Model for Medicine Side-Effect Prediction: Taking Traditional Chinese Medicine as An Example

    In this work, an ontology-based model for AI-assisted medicine side-effect (SE) prediction is developed, and its three main components, the drug model, the treatment model, and the AI-assisted prediction model, are presented. To validate the proposed model, an ANN structure is established and trained on two hundred and forty-two TCM prescriptions. These data are gathered and classified from the most famous ancient TCM book and more than one thousand SE reports, in which two ontology-based attributes, hot and cold, are introduced to evaluate whether a prescription will cause an SE. The preliminary results reveal a relationship between the ontology-based attributes and the corresponding predicted indicator that can be learnt by AI for predicting SEs, which suggests the proposed model has potential in AI-assisted SE prediction. However, it should be noted that the proposed model depends heavily on sufficient clinical data, and therefore further exploration is needed to improve the prediction accuracy.

    LM-VC: Zero-shot Voice Conversion via Speech Generation based on Language Models

    Language model (LM) based audio generation frameworks, e.g., AudioLM, have recently achieved new state-of-the-art performance in zero-shot audio generation. In this paper, we explore the feasibility of LMs for zero-shot voice conversion. An intuitive approach is to follow AudioLM: tokenize speech into semantic and acoustic tokens with HuBERT and SoundStream, respectively, and convert source semantic tokens to target acoustic tokens conditioned on acoustic tokens of the target speaker. However, such an approach encounters several issues: 1) the linguistic content contained in semantic tokens may get dispersed during multi-layer modeling, while the lengthy speech input in the voice conversion task makes contextual learning even harder; 2) the semantic tokens still contain speaker-related information, which may be leaked to the target speech, lowering the target speaker similarity; 3) the generation diversity in the sampling of the LM can lead to unexpected outcomes during inference, causing unnatural pronunciation and speech quality degradation. To mitigate these problems, we propose LM-VC, a two-stage language modeling approach that generates coarse acoustic tokens for recovering the source linguistic content and the target speaker's timbre, and then reconstructs the fine acoustic tokens that carry acoustic details as the converted speech. Specifically, to enhance content preservation and facilitate better disentanglement, a masked prefix LM with a mask prediction strategy is used for coarse acoustic modeling. This model is encouraged to recover the masked content from the surrounding context and generate target speech based on the target speaker's utterance and corrupted semantic tokens. Besides, to further alleviate the sampling error in the generation, an external LM, which employs window attention to capture the local acoustic relations, is introduced to participate in the coarse acoustic modeling.
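
The "corrupted semantic tokens" mentioned in the abstract above can be illustrated with a small sketch. This is not the LM-VC implementation: the abstract does not give the masking scheme, so the span length, mask ratio, `MASK_ID` value, and the function name `corrupt_tokens` are all hypothetical choices made only for illustration.

```python
import random

MASK_ID = -1  # hypothetical id reserved for the mask token (assumption)


def corrupt_tokens(tokens, span_len=3, mask_ratio=0.3, seed=0):
    """Replace random contiguous spans of a token sequence with MASK_ID.

    Returns the corrupted copy and the sorted list of masked positions.
    span_len and mask_ratio are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    out = list(tokens)
    n_target = int(len(out) * mask_ratio)  # minimum number of positions to mask
    masked = set()
    attempts = 0
    # Keep drawing span start positions until enough positions are covered.
    while len(masked) < n_target and attempts < 10 * len(out):
        start = rng.randrange(len(out))
        for i in range(start, min(start + span_len, len(out))):
            masked.add(i)
        attempts += 1
    for i in masked:
        out[i] = MASK_ID
    return out, sorted(masked)
```

A model trained with such inputs would be asked to recover the original tokens at the masked positions from the surrounding context, which is the general idea the abstract describes.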

    Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints

    Conveying the linguistic content and maintaining the source speech's speaking style, such as intonation and emotion, is essential in voice conversion (VC). However, in a low-resource situation, where only limited utterances from the target speaker are accessible, existing VC methods struggle to meet this requirement and capture the target speaker's timbre. In this work, a novel VC model, referred to as MFC-StyleVC, is proposed for the low-resource VC task. Specifically, a speaker timbre constraint generated by a clustering method is proposed to guide target speaker timbre learning in different stages. Meanwhile, to prevent over-fitting to the target speaker's limited data, perceptual regularization constraints explicitly maintain model performance on specific aspects, including speaking style, linguistic content, and speech quality. Besides, a simulation mode is introduced to simulate the inference process to alleviate the mismatch between training and inference. Extensive experiments performed on highly expressive speech demonstrate the superiority of the proposed method in low-resource VC. Comment: Accepted by ICASSP 202

    Beamforming Designs and Performance Evaluations for Intelligent Reflecting Surface Enhanced Wireless Communication System with Hardware Impairments

    An intelligent reflecting surface (IRS) can effectively control the wavefront of impinging signals, and has emerged as a promising way to improve the energy and spectrum efficiency of wireless communication systems. Most existing studies assume that the hardware operates perfectly, without any impairment. In practice, however, both the physical transceiver and the IRS suffer from non-negligible hardware impairments, which bring major challenges, e.g., increasing the difficulty and complexity of the beamforming designs and degrading the system performance. In this paper, taking hardware impairments into consideration, we design the transmit and reflect beamforming and evaluate the system performance. First, we use the linear minimum mean square error estimator to estimate the channels, and analyze the factors that affect estimation accuracy. Then, we derive the optimal transmit beamforming vector, and propose a gradient-descent-based algorithm to obtain a sub-optimal reflect beamforming solution. Next, we analyze the asymptotic channel capacities by considering two types of asymptotics, with respect to the transmit power and to the numbers of antennas and reflecting elements. Finally, we analyze the power scaling law and the energy efficiency. By comparing the performance of our proposed algorithm with the upper bound on the performance of the globally optimal reflect beamforming solution, the simulation results demonstrate that our proposed algorithm offers outstanding performance with low computational complexity. The simulation results also show that there is no need to spend heavily on expensive antennas to achieve both high spectral efficiency and energy efficiency when the communication system is assisted by an IRS and suffers from hardware impairments. Comment: arXiv admin note: text overlap with arXiv:2004.09804, arXiv:2004.0976
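
As a rough illustration of the linear minimum mean square error (LMMSE) estimation step named in the abstract above, the sketch below treats a single scalar channel coefficient under the textbook pilot model y = sqrt(p) h + n, with zero-mean h of variance `channel_var` and zero-mean noise of variance `noise_var`. This is an assumption made for illustration: the paper's actual signal model, including how hardware impairments enter it, is not given in the abstract, so any distortion is simply folded into `noise_var` here. The function names are hypothetical.

```python
import math


def lmmse_estimate(y, pilot_power, channel_var, noise_var):
    """LMMSE estimate of h from the observation y = sqrt(pilot_power) * h + n.

    Assumes h and n are zero-mean and uncorrelated (scalar textbook model).
    Works for real or complex y.
    """
    gain = math.sqrt(pilot_power) * channel_var / (pilot_power * channel_var + noise_var)
    return gain * y


def lmmse_mse(pilot_power, channel_var, noise_var):
    """Mean square error of the estimate above under the same assumptions."""
    return channel_var * noise_var / (pilot_power * channel_var + noise_var)
```

Under this simplified model, the estimation error shrinks as the pilot power grows or as the noise variance falls, which matches the kind of accuracy factors the abstract says are analyzed.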